Efficient posterior sampling for high-dimensional imbalanced logistic regression
High-dimensional data are routinely collected in many areas. We are
particularly interested in Bayesian classification models in which one or more
variables are imbalanced. Current Markov chain Monte Carlo algorithms for
posterior computation are inefficient as the sample size n and/or the number of predictors p increase due to
worsening time per step and mixing rates. One strategy is to use a
gradient-based sampler to improve mixing while using data sub-samples to reduce
per-step computational complexity. However, usual sub-sampling breaks down when
applied to imbalanced data. Instead, we generalize piece-wise deterministic
Markov chain Monte Carlo algorithms to include importance-weighted and
mini-batch sub-sampling. These approaches maintain the correct stationary
distribution with arbitrarily small sub-samples, and substantially outperform
current competitors. We provide theoretical support and illustrate gains in
simulated and real data applications.
Comment: 4 figures
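To make the sub-sampling idea concrete, here is a minimal sketch of an importance-weighted sub-sampled gradient for a one-parameter logistic regression. It is not the paper's piecewise deterministic sampler; it only illustrates why reweighting by the sampling probabilities keeps a gradient estimate unbiased when rare observations are sampled more often. The function names, the scalar parameter `beta`, and the sampling probabilities `probs` are all assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def full_gradient(beta, xs, ys):
    # Gradient of the logistic log-likelihood over all observations
    # (scalar covariate, scalar coefficient; an illustrative simplification).
    return sum(x * (y - sigmoid(x * beta)) for x, y in zip(xs, ys))


def weighted_subsample_gradient(beta, xs, ys, probs, m, rng):
    # Draw m indices with probabilities probs (assumed to sum to 1), then
    # divide each sampled term by its sampling probability. The expectation
    # of the result equals full_gradient(beta, xs, ys).
    idx = rng.choices(range(len(xs)), weights=probs, k=m)
    terms = (xs[i] * (ys[i] - sigmoid(xs[i] * beta)) / probs[i] for i in idx)
    return sum(terms) / m


# Hypothetical imbalanced toy data: a single positive case among six.
xs = [2.0, -1.0, 0.5, 1.5, -0.5, 1.0]
ys = [1, 0, 0, 0, 0, 0]
# Assumed design choice: give the rare positive case half of the sampling mass.
probs = [0.5, 0.1, 0.1, 0.1, 0.1, 0.1]

rng = random.Random(0)
estimate = weighted_subsample_gradient(0.5, xs, ys, probs, 2, rng)
```

A single draw of `estimate` is noisy; only its average over many draws matches `full_gradient`. How the probabilities should be chosen to control the variance is not specified in the abstract, so the values above are placeholders.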
Posterior computation with the Gibbs zig-zag sampler
An intriguing new class of piecewise deterministic Markov processes (PDMPs)
has recently been proposed as an alternative to Markov chain Monte Carlo
(MCMC). In order to facilitate the application to a larger class of problems,
we propose a new class of PDMPs termed Gibbs zig-zag samplers, which allow
parameters to be updated in blocks with a zig-zag sampler applied to certain
parameters and traditional MCMC-style updates to others. We demonstrate the
flexibility of this framework on posterior sampling for logistic models with
shrinkage priors for high-dimensional regression and random effects and provide
conditions for geometric ergodicity and the validity of a central limit
theorem.
Comment: 29 pages, 4 figures
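As background for the zig-zag component, the following is a minimal sketch of a plain one-dimensional zig-zag process targeting a standard Gaussian. It is not the Gibbs zig-zag sampler proposed in the abstract, which also includes block updates of other parameters. The function name `zigzag_gaussian` and the choice of target are assumptions for illustration. For the potential U(x) = x^2 / 2 the switching rate is max(0, v * x), so event times can be drawn exactly by inverting the integrated rate.

```python
import math
import random


def zigzag_gaussian(n_events, x0=0.0, v0=1.0, seed=0):
    """Simulate a 1D zig-zag process for a standard Gaussian target and
    return time-averaged estimates of the mean and variance."""
    rng = random.Random(seed)
    x, v = x0, v0
    total_time = 0.0
    int_x = 0.0   # integral of x(t) along the path
    int_x2 = 0.0  # integral of x(t)^2 along the path
    for _ in range(n_events):
        a = v * x
        e = rng.expovariate(1.0)
        # Solve integral_0^tau max(0, a + t) dt = e for tau.
        tau = -a + math.sqrt(max(a, 0.0) ** 2 + 2.0 * e)
        x_new = x + v * tau
        # Exact integrals over the linear segment from x to x_new.
        int_x += x * tau + 0.5 * v * tau ** 2
        int_x2 += (x_new ** 3 - x ** 3) / (3.0 * v)
        total_time += tau
        # At an event, the velocity flips and the position is continuous.
        x, v = x_new, -v
    mean = int_x / total_time
    variance = int_x2 / total_time - mean ** 2
    return mean, variance


mean, variance = zigzag_gaussian(20000, seed=1)
```

Estimates are computed as time averages over the continuous trajectory rather than from the positions at event times, since the event positions alone are not distributed according to the target.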
Clustering Multiple Sclerosis Medication Sequence Data with Mixture Markov Chain Analysis with covariates using Multiple Simplex Constrained Optimization Routine (MSiCOR)
Multiple sclerosis (MS) is an autoimmune disease of the central nervous
system that causes neurodegeneration. While disease-modifying therapies (DMTs)
reduce inflammatory disease activity and delay worsening disability in MS,
there are significantly varying treatment responses across people with MS
(pwMS). pwMS often receive serial monotherapies of DMTs. Here, we propose a
novel method to cluster pwMS according to the sequence of DMT prescriptions and
associated clinical features (covariates). This is achieved via a mixture
Markov chain analysis with covariates, where the sequence of prescribed DMTs
for each patient is modeled as a Markov chain. Given the computational
challenges to maximize the mixture likelihood on the constrained parameter
space, we develop a pattern search-based global optimization technique that
can optimize any objective function on a collection of simplexes and is shown to
outperform other related global optimization techniques. In simulation
experiments, the proposed method is shown to outperform the
Expectation-Maximization (EM) algorithm-based method for clustering sequence
data without covariates. Based on the analysis, we divided MS patients into 3
clusters: interferon-beta dominated, multi-DMTs, and natalizumab dominated.
Further cluster-specific summaries of relevant covariates indicate patient
differences among the clusters. This method may guide the DMT prescription
sequence based on clinical features.
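The mixture Markov chain idea can be sketched as follows, in a simplified form without covariates and without the MSiCOR optimizer described in the abstract. Each cluster has its own initial distribution and transition matrix, and a sequence is assigned to clusters through posterior membership probabilities. The function names, the integer state coding, and all parameter values are hypothetical and chosen only for illustration; every probability is assumed to be strictly positive so the logarithms are defined.

```python
import math


def sequence_log_prob(seq, init, trans):
    # Log-probability of a state sequence under one first-order Markov chain.
    logp = math.log(init[seq[0]])
    for a, b in zip(seq, seq[1:]):
        logp += math.log(trans[a][b])
    return logp


def responsibilities(seq, weights, inits, transs):
    # Posterior probability that the sequence belongs to each cluster,
    # computed with a log-sum-exp step for numerical stability.
    logs = [
        math.log(w) + sequence_log_prob(seq, init, trans)
        for w, init, trans in zip(weights, inits, transs)
    ]
    top = max(logs)
    log_norm = top + math.log(sum(math.exp(l - top) for l in logs))
    return [math.exp(l - log_norm) for l in logs]


# Hypothetical two-cluster model over three states (0, 1, 2).
weights = [0.5, 0.5]
inits = [[0.8, 0.1, 0.1], [0.1, 0.1, 0.8]]
transs = [
    [[0.8, 0.1, 0.1], [0.4, 0.3, 0.3], [0.4, 0.3, 0.3]],  # favors state 0
    [[0.3, 0.3, 0.4], [0.3, 0.3, 0.4], [0.1, 0.1, 0.8]],  # favors state 2
]

resp = responsibilities([0, 0, 0, 1], weights, inits, transs)
```

In the setting of the abstract the states would correspond to prescribed DMTs and the cluster parameters would be estimated by maximizing the mixture likelihood over the simplex-constrained parameter space; here they are fixed by hand.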
SEQUENTIAL MONTE CARLO FOR BAYESIAN INFERENCE AND DATA ASSIMILATION
Ph.D. (Doctor of Philosophy)